Smooth Boosting and Linear Threshold Learning with Malicious Noise
Abstract
We describe a PAC algorithm for learning linear threshold functions when some fraction of the examples used for learning are generated and labeled by an omniscient malicious adversary. The algorithm has complexity bounds similar to the classical Perceptron algorithm but can tolerate a substantially higher level of malicious noise than Perceptron and thus may be of significant practical interest. At the heart of our algorithm is a new boosting procedure which is guaranteed to generate only distributions which are (optimally) smooth. By using this boosting procedure in conjunction with a noise-tolerant weak learning algorithm, our algorithm can learn successfully despite higher rates of malicious noise than previous approaches.
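For reference, the classical Perceptron algorithm that the abstract uses as a complexity benchmark can be sketched as follows (a minimal textbook version, not the paper's algorithm):

```python
def perceptron(examples, epochs=10):
    # examples: list of (x, y) pairs with x a list of floats and y in {-1, +1}.
    # Classic Perceptron update: on a mistake (non-positive margin),
    # add y * x to the weight vector.
    n = len(examples[0][0])
    w = [0.0] * n
    for _ in range(epochs):
        for x, y in examples:
            if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]
    return w
```

On linearly separable, noise-free data this converges to a separating hyperplane; the point of the paper is that under malicious noise such a vanilla update is far more fragile than the smooth-boosting-based approach.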
Similar resources
Smooth Boosting and Learning with Malicious Noise
We describe a new boosting algorithm which generates only smooth distributions which do not assign too much weight to any single example. We show that this new boosting algorithm can be used to construct efficient PAC learning algorithms which tolerate relatively high rates of malicious noise. In particular, we use the new smooth boosting algorithm to construct malicious noise tolerant versions...
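The "smooth distribution" idea can be illustrated with a hypothetical MadaBoost-style capped reweighting (an assumed sketch; the paper's exact update rule may differ). Capping each example's unnormalized weight at 1 prevents the booster from piling weight onto any single (possibly adversarially corrupted) example:

```python
import math

def smooth_distribution(margins):
    # margins[i] is the current cumulative margin of example i
    # (positive = classified correctly with confidence).
    # MadaBoost-style capped weight: w_i = min(1, exp(-margin_i)).
    # The cap keeps any one example's normalized weight bounded,
    # so the resulting distribution stays smooth.
    w = [min(1.0, math.exp(-g)) for g in margins]
    total = sum(w)
    return [wi / total for wi in w]
```

Compare with AdaBoost's uncapped `exp(-margin)` weights, where a few persistently misclassified (e.g. maliciously mislabeled) examples can come to dominate the distribution.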
Learning Halfspaces with Malicious Noise
We give new algorithms for learning halfspaces in the challenging malicious noise model, where an adversary may corrupt both the labels and the underlying distribution of examples. Our algorithms can tolerate malicious noise rates exponentially larger than previous work in terms of the dependence on the dimension n, and succeed for the fairly broad class of all isotropic log-concave distributio...
Attribute-efficient learning of decision lists and linear threshold functions under unconcentrated distributions
We consider the well-studied problem of learning decision lists using few examples when many irrelevant features are present. We show that smooth boosting algorithms such as MadaBoost can efficiently learn decision lists of length k over n boolean variables using poly(k, logn) many examples provided that the marginal distribution over the relevant variables is “not too concentrated” in an L2-no...
Dissertation Research: It isn't enough to be smooth
The ISIS team in ISR-2 primarily uses supervised learning techniques to solve classification problems in imagery and therefore has a strong interest in finding linear classification algorithms that are both robust and efficient. Boosting algorithms take a principled approach to finding linear classifiers, and they have been shown to be so effective in practice that they are widely used in a varie...
Smooth Boosting Using an Information-Based Criterion
Smooth boosting algorithms are variants of boosting methods which handle only smooth distributions on the data. They are provably noise-tolerant and can be used in the "boosting by filtering" scheme, which is suitable for learning over huge data sets. However, current smooth boosting algorithms have room for improvement: among non-smooth boosting algorithms, real AdaBoost or InfoBoost can per...
Publication date: 2007